In this project, an algorithm to automatically detect white blood cells (WBC) and subsequently
examine Acute Lymphoblastic Leukemia (ALL) using a Convolutional Neural Network (CNN)
is proposed. Several pretrained CNN models, namely VGG, GoogleNet
and AlexNet, were analysed to compare their performance in differentiating
lymphoblast and non-lymphoblast cells from the ALL_IDB database. The tuning
is done by experimenting with the convolution layer, pooling layer and fully
connected layer. Technically, 70% of the images are used for training
and the other 30% for testing. From the experiments, it is found that the best
pretrained models are VGG and GoogleNet, which outperform AlexNet
by achieving 100% accuracy in training. In testing, VGG obtained
the highest performance with 99.13% accuracy. Apart from that,
VGG also proved to give a better result based on the training graph, which
is more stable and contains less error compared to the other two models.

Keywords: White blood cell
1. INTRODUCTION

Leukemia is a blood cancer that is closely related to White Blood Cells (WBC) and can
be fatal [1]. WBC are an essential part of the human immune system, which helps to fight diseases
and viruses [2]. Humans are surrounded by harmful agents, bacteria and viruses every day, hence a strong
immune system is needed. WBC analysis is crucial because it helps to monitor the immunity level
for early prevention. It can also potentially be used to diagnose diseases such as HIV, Lymphoma
and Leukemia. In this paper, Acute Lymphoblastic Leukemia (ALL) is detected from the WBC region in a blood
smear image, as illustrated in Figure 1.

There are five types of WBC: Neutrophils, Eosinophils, Basophils, Monocytes and Lymphocytes,
as shown in Figure 2 [3]. ALL arises from the presence of lymphoblasts, which are abnormal
lymphocyte cells. A lymphoblast can be differentiated by its shape irregularities, small cavities in the cytoplasm,
spherical particles within the nucleus and the number of lobes in the nucleus [4]. This disease commonly affects
young children below 15 years old, with a reported figure of 25% [5].

The conventional method of detecting ALL is manual analysis, in which a pathologist reviews
the blood sample image by eye [6-7]. The outcome is highly dependent on the pathologist’s skill and experience [8].
Different pathologists may produce different results, which creates confusion. In addition,
as the number of samples increases, the task becomes more challenging and time consuming [9].
Blood smear images also suffer from non-uniform illumination, which makes the pathologist’s work harder [10].
An automated system can therefore aid pathologists in blood diagnosis [11]. In particular, computer-aided
systems based on machine learning have been proposed to identify lymphoblasts and detect ALL.
However, machine learning involves a complicated pipeline of segmentation, feature extraction
and classification [12, 13]. It is also challenging because the classification result is highly dependent
on the selection of features: if the selected features are insignificant, classification accuracy suffers.
Figure 2. Illustration of five types of WBC in a blood smear image [3]


In this work, a computer-aided approach using a Convolutional Neural Network (CNN), which is a deep learning
model, is adopted to address these limitations. The biggest advantages of deep learning are its ability to learn
the object’s features by itself, the absence of a complex classifier design, and its high performance [14].
It is also widely used in the medical field due to its ability to achieve impressive performance [15]. Deep learning
can be defined as a class of machine learning techniques that exploit many layers of non-linear information
processing for supervised or unsupervised feature extraction and transformation and for pattern analysis
and classification [16]. A CNN processes input data through multiple layers and builds on four key ideas:
local connections, shared weights, pooling and the use of many layers [17]. Deep learning is increasingly
exploited for medical applications. Previous research and applications of deep learning include
image classification for Malaria diagnosis using the AlexNet pretrained model [18].
AlexNet has also been used for fire detection, where it achieved stable accuracy as reported
in [19]. A CNN architecture consisting of 5 layers, of convolutional, pooling and fully connected type,
was proposed to detect the subtype of WBC [20]. Face and non-face image classification reported in [21]
also shows outstanding performance. There is also research on ALL identification that
compares machine learning with CNN, in which CNN showed the best performance [22].
Some works combine CNN with a Recursive Neural Network (RNN) to classify types of WBC [23].
Lastly, the work most closely related to ours is [24], which used the AlexNet pretrained model to identify
lymphoblasts and detect ALL. The rest of the paper is organized as follows: Section 2 describes the proposed
framework and methodology, Section 3 presents the results of the implementation
and their analysis, and Section 4 provides the conclusion and future work.


2. RESEARCH METHOD 
2.1. Dataset 

The images used are from ALL_IDB, a public dataset of blood smear images that contains ALL-affected
cells and healthy cells, as shown in Figure 3. There are three main elements in a blood smear image:
Red Blood Cells (RBC), White Blood Cells (WBC) and the background. The deep purple region
is the WBC, the pinkish or lighter purple region is the RBC, and everything else is considered
the background. In Figure 3, the first row shows lymphoblast cells and the second row shows non-lymphoblast cells.
A lymphoblast can be differentiated by its shape irregularities, small cavities in the cytoplasm, spherical particles within
the nucleus and the number of lobes in the nucleus. In this paper, 30 images of lymphoblasts and 30 images
of healthy cells were examined to detect ALL. All images have the same resolution of 257x257.
However, each image is resized to the input size required by the pre-trained model used. Of the 60 images in total,
70% are used for training and 30% for testing. The implementation uses MATLAB R2018b,
and the networks are built with the Deep Network Designer app in MATLAB.
All the layers used for the CNN models, such as the convolution, fully connected and pooling layers,
are available in this toolbox.
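
The 70%/30% split of the 60 images can be illustrated with a short sketch. It is written in Python for illustration only (the paper's implementation is in MATLAB), and it assumes the split is made per class so that both classes stay balanced, which the text does not state. The function name and the seed are hypothetical.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting every class separately.

    Assumption: a per-class (stratified) split; the paper only states 70%/30%.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_train = round(train_fraction * len(indices))
        train_idx.extend(indices[:n_train])
        test_idx.extend(indices[n_train:])
    return sorted(train_idx), sorted(test_idx)

# 30 lymphoblast and 30 non-lymphoblast images, as described in Section 2.1.
labels = ["lymphoblast"] * 30 + ["non-lymphoblast"] * 30
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 42 training and 18 testing images
```

Under this assumption each class contributes 21 training and 9 testing images.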

Figure 3. Lymphoblast and non-lymphoblast 


2.2. Convolutional neural network

In this paper, we compare the pre-trained models AlexNet, GoogleNet and VGG16 for ALL
detection. The key CNN operations are shown in Figure 4. Once the input image is fed in, filters with
learned weights are applied in the convolution part to generate feature maps. In general, the image size changes
after a convolution layer. Non-linearity is usually introduced with the Rectified Linear Unit (ReLU), which reduces
the feature vector by setting negative values to zero. Lastly, the fully connected layer acts as the classifier that
separates lymphoblast and non-lymphoblast cells. In every pre-trained model used, we changed the last fully
connected layer to a 2-class output with a softmax function. The mini-batch size and the number of epochs are
set to 10 and 6 respectively for all models.
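
The chain of operations described above (convolution, ReLU, pooling, fully connected layer, softmax over 2 classes) can be sketched on a toy example. This is a minimal pure-Python illustration, not the pre-trained networks used in the paper: the 6x6 image, the 3x3 kernel and the fully connected weights are all hypothetical values chosen only to make the data flow visible.

```python
import math

def conv2d_valid(image, kernel):
    """2D convolution (cross-correlation form) without padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Set negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2."""
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]

def fully_connected(features, weights, biases):
    """One output score per row of weights."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy input and weights (not taken from the paper).
image = [[float((r * c) % 5) for c in range(6)] for r in range(6)]
kernel = [[1.0, 0.0, -1.0]] * 3                      # simple vertical-edge filter
feature_map = relu(conv2d_valid(image, kernel))      # 6x6 -> 4x4
pooled = max_pool_2x2(feature_map)                   # 4x4 -> 2x2
features = [v for row in pooled for v in row]        # flatten to 4 values
fc_weights = [[0.1, -0.2, 0.3, 0.0], [-0.1, 0.2, -0.3, 0.1]]
fc_biases = [0.0, 0.0]
probs = softmax(fully_connected(features, fc_weights, fc_biases))
print(len(feature_map), len(pooled), probs)
```

The two entries of `probs` play the role of the lymphoblast and non-lymphoblast class probabilities produced by the replaced 2-class output layer.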

—  AlexNet: It contains 8 layers, consisting of 5 convolutional and 3 fully connected layers, as shown
in Figure 5. The input for AlexNet is an RGB image with a resolution of 227x227. First, an 11x11
convolution mask is applied to the 227x227 input image, followed by 5x5, 3x3, 3x3 and 3x3 convolution
masks. AlexNet uses ReLU as the non-linearity function.

—  GoogleNet: It contains 22 layers, as shown in Figure 6. The input required for GoogleNet is slightly
different from AlexNet: it requires an RGB input image with 224x224 resolution. GoogleNet adds
inception layers, which convolve the input in parallel with filters of different sizes from 1x1 to 5x5; this is done by
applying Gabor filters with different sizes. It is also widely known for its ability to handle image-related
problems [22].
—  VGG: This pre-trained model consists of 19 layers, as shown in Figure 7, and the input image has
a resolution of 224x224, the same as GoogleNet. VGG uses 3x3 filters with a stride of 1 and
same padding in the convolution layers, and 2x2 pooling layers with a stride of 2.
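
The effect of these filter sizes and strides on the image size can be checked with the standard output-size formula, floor((n + 2p - k) / s) + 1, where n is the input size, k the filter size, s the stride and p the padding. The sketch below is illustrative only. The stride of 4 for the first AlexNet convolution and the five pooling stages of VGG are assumptions taken from the original network designs, not from this paper.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

# AlexNet: 11x11 mask on a 227x227 input (stride 4 is an assumption).
alexnet_first = conv_output_size(227, kernel=11, stride=4)

# VGG: 3x3 filter, stride 1, same padding (p = 1) keeps the size unchanged.
vgg_conv = conv_output_size(224, kernel=3, stride=1, padding=1)

# VGG: each 2x2 pooling with stride 2 halves the size
# (five pooling stages assumed).
vgg_sizes = []
size = 224
for _ in range(5):
    size = conv_output_size(size, kernel=2, stride=2)
    vgg_sizes.append(size)

print(alexnet_first, vgg_conv, vgg_sizes)
```

This also shows why the 257x257 dataset images must be resized first: each pre-trained model expects a fixed input resolution.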





Figure 4. CNN key operation

Figure 6. GoogleNet architecture

Figure 7. VGG architecture


3. RESULTS AND ANALYSIS


Images of the two classes, lymphoblast and non-lymphoblast cells, are used to train 3 pre-trained
models: AlexNet, GoogleNet and VGG. The parameters, namely mini-batch size, number of epochs, number of
iterations, and percentage of training and testing images, are set identically for all models to 10, 6, 24, 70% and 30% respectively.
Figure 8 shows the training process for the AlexNet, GoogleNet and VGG models. Based on the graphs, the VGG
model is more stable, as it reaches 100% validation accuracy at iteration 12 and holds it for the remaining iterations,
compared to GoogleNet, which reaches it at iteration 18. This shows that VGG reaches a stable 100% accuracy faster
than the other two models.
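
The iteration count of 24 follows from the other settings, as the short calculation below shows. It assumes that an incomplete final mini-batch in each epoch is discarded, which is how the 42 training images and a mini-batch size of 10 give 4 iterations per epoch. The function name is hypothetical.

```python
def training_schedule(num_images, train_fraction, batch_size, epochs):
    """Return (training images, iterations per epoch, total iterations).

    Assumption: a partial last mini-batch in each epoch is dropped.
    """
    num_train = round(num_images * train_fraction)
    per_epoch = num_train // batch_size
    return num_train, per_epoch, per_epoch * epochs

num_train, per_epoch, total = training_schedule(60, 0.7, batch_size=10, epochs=6)
print(num_train, per_epoch, total)  # 42 images, 4 iterations per epoch, 24 in total
```

With these numbers, iteration 12 corresponds to the end of epoch 3 and iteration 18 to the end of epoch 5 (18 / 4 = 4.5, so partway through epoch 5).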

The three models are compared in terms of training accuracy, testing accuracy and the elapsed
time of the training process. The results are tabulated in Table 1. The training accuracy of AlexNet is 94.44%,
while GoogleNet and VGG both achieve 100%. AlexNet has the lowest training accuracy,
possibly because it has fewer layers, which might not be sufficient for ALL detection.
However, its elapsed time is the shortest compared to GoogleNet and VGG. This is because the AlexNet
model has the fewest layers, 8, compared to GoogleNet and VGG, which have 22
and 19 layers respectively.
Figure 8. Training graph of (a) AlexNet, (b) GoogleNet, (c) VGG


Table 2 compares the testing results for lymphoblast and non-lymphoblast cells.
It can clearly be seen that the VGG model obtains the highest accuracy for both lymphoblast and
non-lymphoblast classification. The testing classification is depicted in Figure 9(a) for AlexNet and GoogleNet
and in Figure 9(b) for VGG. Both AlexNet and GoogleNet make an error in classifying
lymphoblast cells: out of 30 images, one image is misclassified. For the VGG model, all images
are correctly classified into their classes. The best results can be achieved by a model with between
16 and 19 layers, as reported in [25]. VGG is also reported to benefit from stacking
multiple convolutional layers with small kernels, which can enlarge the effective receptive
field of the network [25].
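
The benefit of stacking small kernels can be made concrete with a small calculation. For stride-1 convolutions, a stack of n layers with k x k kernels covers a receptive field of 1 + n(k - 1) pixels per side, while using fewer weights than a single large kernel of the same coverage. The sketch below is a generic illustration; the channel count of 64 is a hypothetical value and biases are ignored.

```python
def stacked_receptive_field(kernel, num_layers):
    """Receptive field (per side) of stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)

def conv_weights(kernel, channels, num_layers=1):
    """Weight count when every layer maps `channels` to `channels` (no biases)."""
    return num_layers * kernel * kernel * channels * channels

channels = 64  # hypothetical channel count
rf_two = stacked_receptive_field(3, 2)     # same coverage as one 5x5 kernel
rf_three = stacked_receptive_field(3, 3)   # same coverage as one 7x7 kernel
weights_stacked = conv_weights(3, channels, num_layers=3)
weights_single = conv_weights(7, channels)
print(rf_two, rf_three, weights_stacked, weights_single)
```

Three stacked 3x3 layers therefore see a 7x7 region with roughly half the weights of one 7x7 layer, and they apply three non-linearities instead of one.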


Table 1. Accuracy of training and testing

CNN Model    Training (%)    Testing (%)    Elapsed time
AlexNet      94.44           97.16          32 sec
GoogleNet    100             91.45          1 min 57 sec
VGG          100             99.13          8 min 35 sec


Table 2. Accuracy of testing for lymphoblast and non-lymphoblast

CNN Model    Lymphoblast (%)    Non-Lymphoblast (%)
AlexNet      97.66              96.66
GoogleNet    89.58              93.32
VGG          99.77              98.49
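
As a quick consistency check between the two tables, the testing accuracy in Table 1 appears to equal the mean of the two per-class accuracies in Table 2 for every model. The paper does not state this relation; it is only an observation from the tabulated numbers, sketched below.

```python
# Per-class testing accuracy (lymphoblast, non-lymphoblast) read from Table 2.
per_class = {
    "AlexNet": (97.66, 96.66),
    "GoogleNet": (89.58, 93.32),
    "VGG": (99.77, 98.49),
}
# Overall testing accuracy read from Table 1.
overall = {"AlexNet": 97.16, "GoogleNet": 91.45, "VGG": 99.13}

consistent = {}
for model, (lymph, non_lymph) in per_class.items():
    mean_accuracy = (lymph + non_lymph) / 2
    # Compare with a tolerance of half a unit in the second decimal place.
    consistent[model] = abs(mean_accuracy - overall[model]) < 0.005
    print(model, round(mean_accuracy, 2), consistent[model])
```

A plain mean is what one would expect if both classes contribute the same number of test images, which matches the balanced 30/30 dataset.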
Figure 9. (a) Confusion matrix of AlexNet and GoogleNet, (b) Confusion matrix of VGG


4. CONCLUSION

In this paper, three pretrained models, AlexNet, GoogleNet and VGG, were compared
for classifying lymphoblast cells for ALL detection. The models have different numbers of layers
and different layer arrangements. They were compared using the Convolutional Neural Network tools in
MATLAB. There were 60 images in total, consisting of 30 lymphoblast images and 30 non-lymphoblast
images. In this project, 70% of the images were used for training and 30% for testing.
As a result, VGG was able to classify lymphoblast and non-lymphoblast cells more accurately than AlexNet
and GoogleNet. In the training process, VGG and GoogleNet were able to achieve 100% accuracy. In
the testing assessment, VGG maintained its high accuracy by obtaining 99.13%, the highest
compared to AlexNet and GoogleNet, which obtained 97.16% and 91.45% respectively. VGG also proved
the best based on its confusion matrix: there was no misclassification of cells with the VGG model, while
AlexNet and GoogleNet each misclassified one image.

In the future, the authors expect to improve the system by modifying the best pretrained model
for WBC, which is VGG, to further improve the result and accuracy. Next, the authors wish to increase the number
of sample images and address the illumination problem. Lastly, the algorithm is expected to be able to classify all five
types of WBC: lymphocytes, monocytes, neutrophils, basophils and eosinophils.